The impact of modifier genes on cone-rod dystrophy heterogeneity: An explorative familial pilot study and a hypothesis on neurotransmission impairment

Cone-rod dystrophies (CORDs) are a heterogeneous group of inherited retinopathies (IRDs) with more than 30 already known disease-causing genes. Uncertain phenotypes and extended range of intra- and interfamilial heterogenicity make still difficult to determine a precise genotype-phenotype correlation. Here, we used a next-generation sequencing approach to study a Sicilian family with a suspected form of CORD. Affected family members underwent ophthalmological examinations and a proband, blind from 50 years, underwent whole genome and exome sequencing. Variant analysis was enriched by pathway analysis and relevant variants were, then, investigated in other family members and in 100 healthy controls from Messina. CORD diagnosis with an intricate pattern of symptoms was confirmed by ophthalmological examinations. A total of about 50,000 variants were identified in both proband’s genome and exome. All affected family members presented specific genotypes mainly determined by mutated GUCY2D gene, and different phenotypical traits, mainly related to focus and color perception. Thus, we looked for possible modifier genes. According to relationship with GUCY2D, predicted functional effects, eye localization, and ocular disease affinity, only 9 variants, carried by 6 genes (CACNG8, PAX2, RXRG, CCDC175, PDE4DIP and LTF), survived the filtering. These genes encode key proteins involved in cone development and survival, and retina neurotransmission. Among analyzed variants, CACNG8c.*6819A>T and the new CCDC175 c.76C>T showed extremely low frequency in the control group, suggesting a key role on disease phenotypes. Such discovery could enforce the role of modifier genes into CORD onset/progression, contributing to improve diagnostic test towards a better personalized medicine.

Introduction Cone-rod dystrophies (CORDs) are one of the most common and heterogeneous inherited ocular diseases (IRDs) characterized by progressive retinal degeneration [1]. CORD estimated incidence ranges from 1 in 20,000-100,000, with a prevalence of 1/40.000 in Europe. It also represents one of the most common cause of legal blindness in school-aged [2]. No data is available on CORD frequency in Sicily. Degeneration could involve both eyes and it is characterized by the primary loss of cone photoreceptors, mainly due to phototransduction cascade impairments, often followed by rod involvement. CORD patients usually manifest the loss of central vision, photophobia and color vision alterations. Furthermore, due to frequent rod degeneration, early nyctalopia could be also showed by affected individuals [1].
More than 30 genes have already been associated with CORDs. These genes are mainly involved in phototransduction and related biochemical pathways, such as outer segment (OS) morphogenesis, intraflagellar transport and neurotransmitter release. Most of their causative mutations are inherited by an autosomal recessive pattern, while dominant forms are mainly related to mutations in GUCY2D, PRPH2, CRX, GUCA1A, PROM1 genes and Xlinked ones to mutations in RPGR, CACNA1F, OPN1LW and OPN1MW genes [3]. However, many other CORD causative/associated genes are still unidentified. Moreover, although the functions of some of the already known genes have been studied, it is still difficult to determine a precise genotype-phenotype correlation in CORDs, because the most of genes are associated to uncertain and/or not unique phenotypes. The existence of multigenic inheritance patterns and modifier genes makes molecular diagnoses difficult, because of phenotypic variability due to complex genetic architectures underpinning the various forms of CORDs. Locus and allelic heterogeneities limit the effectiveness of diagnostic targeted strategies, as many pathogenic variants go undetected [4]. Hence, sequencing all known genes implicated in CORDs simultaneously seems the best current approach for their definitive molecular diagnosis, and next-generation sequencing (NGS) is currently considered a powerful tool for mutation screening [5]. With about 30-60% of previously unrecognized CORD cases already characterized [6], this high-throughput method represents a promising tool for detecting novel disease-causing genes and genotype-phenotype correlation, which may lead to substantial improvement of our understanding of allele pathogenicity, protein function, and population genetics.
Here we describe a Sicilian family came to our attention as a suspected case of CORD. All affected individuals manifested different symptoms related to ocular impairment. Thus, the proband and her two sons and one daughter were screened using whole exome sequencing (WES) and whole genome sequencing (WGS), leading to the identification of guanylate cyclase 2D (GUCY2D) as causative gene, and of 6 novel candidate modifier genes carrying 9 previously undescribed variants.

Ion proton (WGS) and Illumina (WES) sequencing
Genomic DNA was extracted from peripheral blood using QIAamp 1 DNA Mini Kit (Qiagen) then quantified by NanoPhotometer P-330 (Implen). DNA integrity was evaluated by visual inspection on a 1% agarose gel.
Single-end libraries for WGS were generated after fragmentation by the Ion Xpress™ Plus Fragment Library Kit (Thermo Fisher Scientific, Waltham, MA, USA) and clonal amplification by Ion PI™ Template OT2 200 Kit v3 at Ion OneTouch™ 2 Instrument (Thermo Fisher Scientific, Waltham, MA, USA). Then, three Ion PI Chip v.2 were run on an Ion Proton™ System (Thermo Fisher Scientific, Waltham, MA, USA), using the Ion PI Sequencing 200 Kit v2.
Paired-end libraries for WES were generated by the SureSelect XT HS Reagent Kit (V6) (Agilent, Santa Clara, CA, USA) kit and sequenced on a NovaSeq 6000 Illumina platform.
Generated sequences were processed in-house using the bioinformatics software CLC Genomics Workbench 22.0.0 [7]. The analytic pipeline foresaw the alignment to the GRCh38 Human Reference Genome (<3 mismatches/100 bp per alignment), followed by duplicate read removal, InDel realignment and Base Recalibration, before variant calling. Variant calling using the fixed ploidy variant detection was performed, reporting variants with >90% probability. Pyro-error variants were removed through specific filtering (parameters set: "in homopolymer regions with minimum length = 3"; "with frequency below = 0.8"). The algorithm behind the Fixed Ploidy Variant Caller combines a Bayesian model (examining posterior probabilities) with a maximum likelihood approach.
Finally, found variants were annotated by ANNOVAR v.20220330 tool and included databases [8].

Filtering candidate variants
Firstly, variants from both exome and genome sequencing of proband were matched and merged. The first criterion we used to select candidate variants was to determine if newly identified variants had already been associated to CORD or other Inherited Retinal Diseases (IRDs) and their frequency, thanks to explored reference databases included in ANNOVAR (ClinVar, dbSNP, 1000 genomes, HapMap, dbnsfp30a), along with HGMD Professional [9] and RetNet (https://sph.uth.edu/retnet/) databases.
Then, variants were analyzed to assess the potential impact on encoded gene function, evaluating: 1) Coding region variants; 2) Variants potentially involved in splice site alterations; 3) Regulative variants (3' and 5' UTR); 4) Variants with high conservation score. A comparison of exome-derived variants between all analyzed exomes was also realized.

Splicing variant effect prediction
To analyze the possible effects of identified splicing variants, the Alternative Splice Site Predictor (ASSP) tool was exploited. ASSP predicts putative alternative exon isoform, cryptic, and constitutive splice sites of coding exons. Coding sequences are generally characterized by trinucleotide frequencies which differ markedly from the frequencies observed in non-coding sequences. Codon usage reflects the probability of a sequence to be coding by comparing the observed trinucleotide frequencies with those observed in coding sequences. Codon usage values above zero indicate, that a subsequence is probably coding, while sequence showing values below zero are probably not coding (ASSP calculates log-likelihood values). Codon usage is calculated for a sliding window of a given size . Codon usage and stop codons are calculated  for all three possible reading frames (F1-frame 1, F2-frame 2, F3-frame 3). Stop codons should usually not be observed within exons read in the corresponding frame. Splice sites, especially constitutive ones, are usually detected near the putative boundaries of exons, as being indicated by high codon usage values.

Sanger validation and healthypopulation screening
Variants filtered in family affected members were validated by Sanger sequencing. Primers were designed by Primer3 tool [10], following standard settings and, then, carefully manually checked.
Amplification of gene fragments with selected variants was performed using primer pairs designed according to corresponding nucleotide sequences in GenBank. The Polymerase Chain Reaction (PCR) mix was prepared adding 1μg of genomic DNA to 50μl reaction mixture containing a 0.2 μm concentration of each primer and 1 U ACCUZYME™ DNA Polymerase (Bioline). After an initial denaturation step at 95˚C for 1 min, the samples were subjected to 35 cycles of amplification consisting of 15 sec of denaturation at 95˚C, 10 sec of annealing. The annealing temperature was optimized for each primer set. Following PCR, 5 μl of amplified product was examined by electrophoresis on a 1% agarose gel. Sanger sequencing was, then, performed using the BigDye Terminator© v3.1 Cycle Sequencing Kit chemistry and run on a 3130xl Genetic Analyzer (Applied Biosystems, Thermo Fisher Scientific). Primer list and PCR thermal conditions are available upon request.
Then, confirmed variants were searched in healthy family individuals and in a group of 100 healthy subjects from Messina, in order to validate the analyzed variants as disease related.

Cluster analysis by pathway functional enrichment
To determine if different observed clinical phenotypes could be influenced by the combined effects of more than one variant in modifier genes, a cluster analysis by pathway enrichment was realized. This was carried out by Cytoscape (v3.8.2) [11] and its plugins GeneMANIA (v3.5.2) [12], ClueGO (v2.5.7) and CluePedia (v1.5.7) [13]. Subcellular locations, biological processes, molecular function, and KEGG pathways, which were inferred from electronic annotation and experimental data, were all in the identified GO categories. A minimum level of 3 and a maximum level of 8 were set as the GO level interval with a minimum of two genes per category. Our experimental data were directly compared and enriched with publicly available information from STRING [14], IntAct [15], MiMI [16], miRBase [17], and miRecords [18]. miRNAs target site analysis was performed using the PolymiRTS Database (v3.0) [19]. To investigate the complex phenotype-genotype patterns, and due to the relatively low sample size investigated, we adopted a qualitative approach: for each genotype v, we clustered patients showing a given symptom s from those who did not. By considering an additive model (1 = wild-type, 2 = heterozygous, 3 = homozygous mutated), which implies that the risk conferred by an allele is increased 1r-fold for heterozygotes and 2r-fold for homozygotes, we extracted and averaged genotype from each group and calculated the difference d v s . A positive sign indicates that affected patients showed a genotype more directed towards homozygous mutation if compared to unaffected patients; vice versa, a negative sign indicates that unaffected patients showed a genotype preferentially directed towards heterozygous mutation. Positive values may therefore appear in variants involved in symptom manifestation, whereas negative differences may highlight variants whose presence may have a protective role in symptom arousal.

Data mining and statistical analysis
A weight p v s was subsequently created to better define the magnitude of the difference d v s : p v s was assigned a value of 0 when abs(d v s ) < 0.5, alternatively a value of 1 was considered when 0.5 � abs (d v s ) < 1. If abs � 1, p v s was set to 2. By gathering the weights obtained from each genotype v and symptom s, we therefore obtained a matrix vector P s,v = [p v s ], with row elements s representing symptom s, while columns v represent scores for genotypes. The usefulness of the matrix consists in the fact that by summing over the v-th column we are able to obtain a score SC v pol whose magnitude is proportional to its relevance in clustering both affected and unaffected patients given the whole patterns of symptoms.

Clinical analysis of family affected members evidenced a very complex pattern of pathological phenotypes
The Sicilian family showed a dominant inheritance pattern, with a possible incomplete penetrance in the last generation (Fig 1), where the III1 subject, who is apparently healthy, could be too young to develop symptomatology.
The most of clinical features of the four affected family members were typical of those associated to cone-rod dystrophy, such as night blindness from birth, photophobia, generalized dyschromatopsia and progressive loss of visual field and visual acuity, undetectable ERG and macula with prominent atrophic macular lesion (also called "macular colobomas"), even if many aspects remain unclear and different throughout the family patients ( Table 1).
The proband became completely blind at 50 years of age, and fundus examination showed tapetal retinal degeneration, macular pigmentary dystrophy, pale disc and retinal pigmented epithelium atrophy (Fig 2). OCT showed a relevant ellipsoid band loss in outer retina, choroid thickness and a thinning of retinal pigment epithelium (RPE) in both eyes. Additionally, ERGs revealed a generalized cone dysfunction with rod involvement (photopic and scotopic extinct ERGs), with compromised visual response (the tracings showed no response). Clinical examination results of the other affected member are shown in S1-S3 Figs. The remaining four unaffected individuals, instead, showed normal phenotype in all ophthalmological examinations.

NGS data analyses identified possible CORD causative and modifiers genes
WES was initially performed on the proband, then on her 2 sons and 1 daughter. A mean value of 92,806,962 paired-end reads (mean length~150 bp) were produced, considering all exomes. About 92.14% of these reads overcame the average quality score (Phred score) of 30, permitting to detect a mean of 43,921 variants, of which 38,678 single nucleotide variants (SNVs) and 3,941 insertions or deletions (InDels). Subsequently, a total of 172,372,728 single end reads (mean read length~150 bp) were generated for proband's whole genome. This time, an inferior percentage of reads (~73%) were characterized by a Phred score � 30, permitting us to identify a total of 20,306 variants, including 17,501 SNVs and 2,805 InDels. About the 80% of coding sequence variants found in WGS was also detected in proband's exome. A graphical workflow of the whole down-stream data analysis is represented in Fig 3. Using RetNet and Human Genetic Mutation Database (HGMD) Professional, and analyzing the distribution through the affected family members, we found that the causative variant probably responsible of the CORD onset could be the variant c.2513G>C in GUCY2D gene.
According to a preliminary filtering based on relationship with GUCY2D, functional effects, eye localization and ocular disease affinity, 32 candidate variants, distributed within 10 candidate modifier genes, were classified as new missense and UTR mutations (Group 1); missense, nonsense, and inframe SNPs (Group 2); splice-site and UTR regulation site SNPs (Group 3); and synonymous SNPs (Group 4). Detailed features of such variants and their distribution throughout the family are shown in Table 2.    I2 II1 II2 II4 I1 II3 II5 III1   CAUSATIVE

Pathway functional enrichment permitted to filter out nine final candidate variants carried by six potential modifiers genes
The cluster analysis by pathway functional enrichment conducted by Cytoscape and its Gene-MANIA plug-in revealed that relevant genetic and physical interactions connect almost the entire set of analyzed genes, suggesting a common involvement in multiple pathways (S4 Fig). Then, basing on eye and vision relationship computed by ClueGO, identified pathways were subcategorized into 87 main hierarchically structured GO classifications, including 59 biological processes, 13 cellular components, 4 immune system processes, and 11 molecular functions (Fig 6). Two KEGG pathways, 12 Reactome pathways, and 2 WikiPathways emerged from analysis (S2 Table). The overall results indicated that the identified genes belonging to these GO categories may play the most important roles in regulation of eye/retina development, homeostasis and functionality, especially in synaptic activities and ciliary transport.
Furthermore, the CluePedia enrichment analysis highlighted ninety-three terms connected by 247 edges (activation, binding, catalysis, expression, inhibition, ptmod, reaction) with the By CluePedia Cerebral plug-in layout [37] it was also possible to establish that most of the protein encoded by the 8 genes are located in intracellular compartments, with only one (CCDC175) that could be extracellular and one (CACNG8) that could act at the plasma membrane level (S6 Fig). By using the mIRANDA database we showed that from 2 to 9 miRNAs could regulate gene expression of each analyzed gene (S7 Fig). These data are supported by the PolymiRTS Database, which indicates that the presence of CACNG8 c. � 6819A>T and PDE4-DIP c. � 39G>A could create or disrupt several miRNAs target sites (Table 3).
Finally, 9 variants carried by 8 candidate modifiers genes were filtered out and, soon after, validated by Sanger sequencing (Fig 7).

Data mining and statistical analysis
Once established that the selected genes were involved in common eye related pathways, it was necessary to determine the genotype-phenotype association with the pathology. We analyzed the 32 variant genotypes in all family members and noticed that the only variant with different distribution between healthy and sick subjects was CACNG8 c. � 6819A>T (S8 Fig). This observation was statistically confirmed by Fischer exact test, which associated the variant to disease (p-value = 0.040). However, these data did not suffice to explain the wide spectrum of different CORD phenotypes. To discriminate family phenotypes, we tried to associate other selected variants to clinical symptoms, which were quite varied in the affected family members.  (Fig 8A), we assigned different colors to differentiate between: I. irrelevant differences between affected and unaffected groups (-0.05�d<0.05, black); II. low differences between groups, likely having a pathogenic, causative (0.05�d<1, orange), or a protective (-1<d�-0.05, cyan) role against symptom manifestation; III. relevant differences between groups, linked to stronger evidence of a pathogenic, causative (d> = 1, red) or protective (d�-1, blue) role against symptom manifestation.
In Fig 8A, white stripes correspond to symptoms that were common to all patients; in such cases, a differentiation between affected and unaffected groups could not be obtained. In Fig  8B, we show scores assigned to each variant with a stem plot; a high score indicates that the presence of such variant likely plays a relevant role in symptom manifestation and/or absence. From Fig 8B, it is clear that some variants play a bigger role than others in symptom patterns. To help interpretation, in Fig 8B we further represented median score by means of horizontal line median score (black line), as well as the summation of median score with one (blue line) and two (red line) standard deviations. According to these analyses, it resulted that CACNG8 c. � 6819A>T departs two standard deviations from the median, while the other eight genes (PAX2 c.-375C>A, RXRG c.875+6A>G, CCDC175 c.-79C>G and c.76 C>T, PDE4DIP c. � 39G>A and c. � 105 G>A, LTF c.85G>A and c.140A>G) fall between one and two standard deviations from the median. No other significant differences were observed for the other variants. These results led us to the idea that the heterogeneity of the phenotypes could be determined by the above variants, differently distributed amongst the family patients. Moreover, to highlight the similarity of family components, a clustering based on all variant genotypes (S9A and S9B Fig) and symptom (S9C and S9D Fig) distribution was performed, with a cut-off of 70% and 80% respectively, which revealed that the proband and two sons (II1 and II2) cluster together genetically, as (II3 and III1), while common clinical findings are found for I1, II3, II5 and III1, who appear healthy.

Population screening
Based on Exome Variant Server (http://evs.gs.washington.edu/EVS/), for all the variants identified, minor allele frequency (MAF) was at least 5% in all populations assayed. To determine

Discussion
CORD is a highly heterogeneous disease in clinical features and genetics, and association of mutations/variants in at least 30 genes to the various forms of the disease is not immediate [20]. Furthermore, it is highly likely that mutations/variants in other still unidentified genes, as well as non-coding RNAs [38,39], could contribute to CORD pathogenesis. The complexity of the situation is compounded by the variety of genotype-phenotype combinations that occur. Heterogeneity depends on various factors. First, the allelic factor: each gene may present various disease-causing mutations leading to the same phenotype. Secondly, the genetic factor: different genes, when mutated, might induce the same pathological phenotype. Thirdly, the By considering an additive model, for each symptom we calculated difference between average genotype for affected vs unaffected patients. This value was then assigned a score according to effect size (mild = 0.5, promising> = 1). Score is shown on a heatmap that reflects the strength (mild or strong) and putative causative (reds), or protective (blues) role of examined variants. White rows arise when all patients show a given symptom, thus making comparison of genotype distribution impossible. b) Stem plot shows total score obtained for a given variant when considering absolute value of effect sizes gathered across all symptoms; this measure represents the overall relevance of that variant in explaining symptom patterns. Lines highlight median (black), more one (blue) and two (red) standard deviations with respect to score distribution. https://doi.org/10.1371/journal.pone.0278857.g008

PLOS ONE
phenotypic factor: different mutations in the same gene can lead to different clinical pictures. Fourthly, the clinical factor: the same gene mutation can cause different signs and symptoms, even in members of the same family [40]. We analyzed the case of an undefined CORD affected family members showing a wide range of phenotypes. Such clinical heterogeneity could result from a combination of previously described factors. WES and WGS [41] permitted us to investigate the whole genome of the proband and find new potentially CORD-associated gene variants, which were then studied in the other family members. Such approach permitted to underline the multigenic nature of this group of retinal diseases, as well as the possible involvement of modifier genes into related phenotype heterogeneity.
The causative gene identified after all analyses resulted the GUCY2D (HGNC ID: 4689, OMIM: 600179) which encodes a retina-specific guanylate cyclase GC-E/RetGC1, which is highly expressed in the outer segment membranes of cones, and to a lesser extent in rods [42,43]. GC-E is a key enzyme in the phototransduction cascade, catalyzing the production of cGMP (cyclic guanosine monophosphate) from GTP (guanosine triphosphate) in a calciumsensitive manner [44,45].
One variant found within GUCY2D gene, c.2513G>C, is already known to cause an autosomal dominant form of cone-rod dystrophy [46][47][48]. The combination of this variant with the c.3044-7G>T, also in GUCY2D, might enforce the effects of GUCY2D mutated form.
However, the presence of very heterogeneous phenotype exhibited by all affected family members made us assume that such complexity might be determined not only by the combination of previous cited variants, but also by mutated modifiers genes. Thus, by applying previously described filtering criteria and combining data mining with a deep eye-related pathway analysis, nine variants, carried by six genes (PAX2, RXRG, CACNG8, CCDC175, PDE4DIP, and LTF), emerged as potentially associated to the different phenotypes.
Our exploratory working hypothesis, explained below, starts from the primary phototransduction impairment determined by mutated GUCY2D, and bases its novelty on the recent discovery regarding photosensitive retinal ganglion cells (ipRGCs), and their capacity of expressing melanopsin intrinsically [49]. These cells represent the third class of retinal photoreceptors whose axon collaterals constitute a centrifugal pathway to upstream dopaminergic amacrine cells (DACs) forwarding melanopsin-based signals from the innermost retina to the outer retina [50]. Melanopsin-based membrane depolarization could trigger glutamate release locally onto DACs, through activation of AMPA glutamate receptors, depolarizing DACs and increasing action potential firing frequency, triggering dopamine release [50]. Increased dopamine release mediated by ipRGC activity (which integrates light-adapted cone and melanopsin signals) can also act through D4 dopamine receptors expressed on cones. Cones are known to affect cGMP metabolism, gene expression, and rod-cone coupling, regulating cone photoreceptor adaptation. In addition, the loss of M1 subtype of ipRGCs attenuates light adaptation, as evidenced by an impaired electroretinogram b-wave from cones [50].
This scenario highlights the central role of cones, whose impairment is primary at disease onset in all affected family members, as evidenced by clinical exams and by altered color perception. Such cone compromised activity, originated from GUCY2Ddefects, might be differentially worsened by altered retrograde signaling from the inner retinal cells due to variants carried by candidate modifier genes.
The driver modifier activity might belong to c. � 6819A>T carried by CACNG8, which encodes the γ-8 member of transmembrane AMPA receptor-associated regulatory proteins (TARPs). TARPs control synaptic strength both by targeting AMPARs to synapses through an intracellular PDZ-binding motif and by regulating their gating through an extracellular domain [51]. Although the extracellular domain of γ-8 can determine slower AMPAR decay kinetics, it is constrained by an inhibitory influence of AMPAR channel gating exerted by the intracellular domains of γ-8 [52].As previously reported, the frequency of CACNG8 c. � 6819A>Tis extremely low in the Messina population. This variant is candidate to decrease the expression of TARP γ-8 physiologically, interacting with DACs and RGCs AMPARs and reducing both retrograde and anterograde signal transmissions. The arrest of the anterograde pathway may send a stop signal to RGCs, impairing optic nerve activity, (evidenced in II1 and II4) by altered evoked potentials. Interruption of retrograde signaling may involve cones, which consequently may die, due to inability to perform neuronal transmission function.
Additionally, the presence of the RXRG splicing variant c.875+6A>G, not frequent in the Messina population, may also alter the phenotypic proportion of L and S cones and their differentiation from birth, leading to confused color perception. Such hypothesis is supported by the well-established role of RXRG in cone differentiation, mediated by 9-cis retinoic acid [53,54].
An altered expression of PAX2 c.-375C>A could also decrease cone cell genesis and correct glial phenotype manifestation. That is a consequence of impairing astrocytes and other neural supporting cells, especially in RPE, preventing it from ensuring a trophic and protective role towards the underlying photoreceptor layer of the retina [55][56][57][58].
This scenario could promote cone and rod death, due to the lack of fundamental antiinflammatory activity of lactotransferrin [59,60], which failed because of the presence of LTF c.85G>A and c.140A>G and, probably, CCDC175 c.-79C>G and c.76 C>T.
Explaining the role of CCDC175, however, remains the most challenge aspect to be faced. Little is known about CCDC175 (coiled-coil domain containing 175, 14q23.1), especially in homo sapiens. The only available data come from the STRING database and highlight an interaction between the Mus musculus ccdc175 protein and IL23R (Interleukin 23 Receptor), which is associated with two eye-relevant and rare diseases, Behçet's Disease and Vogt-Koyanagi-Harada Syndrome [61][62][63]. We conducted a whole transcriptome analysis on human RPE cells which showed a decreased expression of CCDC175, after an oxidant chemical compound treatment. Therefore, we can say that expression of CCDC175 occurs in the human retina, as well as other considered genes, established by the same RNA-Seq experiment (submitted data, partially shown in S3 Table).
Finally, photoreceptor death may be due to reduced expression of PDE4DIP variants, the 3' UTR variants c. � 39G>A and c. � 105 G>A. These variants might determine the block of microtubule polymerization, preventing the proper transport of phototransduction proteins between the inner and the outer photoreceptor segments, mediated by USH transport network [64][65][66][67][68]. The hypothetical compromised functional scenario is represented in Fig 9. This pathophysiological model appears consistent with bioinformatic and literature data and is currently being experimentally tested.

Conclusions
We identified a complex set of candidate gene-variants closely associated with a form of CORD mainly determined by the consequences of mutated GUCY2D. These candidate genes encode several fundamental proteins involved in retina and eye physiology, and might act as modifier genes determining the heterogeneity of exhibited phenotypes.
Our discovery could permit to progress towards a better classification of all CORD forms and to shed light on the mechanisms of the disease group.
However, our pilot exploratory study presented several limitations, such as the restricted number of examined patients and the absence of functional essays needed to confirm the obtained results about genotypic and phenotypic association. Furthermore, the additive model evaluated is not warranted, and most interesting data came from WES data only. Additionally, even if important symptomatology differences are certain for affected members, others might be reduced during the disease course. Thus, next step will foresee experimental validation (e. g. Proposed regulatory pattern related to CORD pathogenic mechanisms. Our hypothesis is based on primary cone compromised activity, originated from GUCY2D defects, might worsened by altered retrograde signaling from the inner retinal cells due to variants carried by candidate modifier genes. CACNG8 c. � 6819A>T could decrease the expression of TARP γ-8 physiologically interacting with DACs and RGCs AMPARs, reducing the retrograde and anterograde signal transmission determining, respectively, an impairment of optic nerve activity and co death. Additionally, the presence of the RXRG c.875+6A>G and PAX2 c.-375C>A may have already compromised cones and their differentiation, probably involving astrocytes and other neural supporting cells, especially in RPE. This scenario could also promote rods death, due to the lack of the fundamental anti-inflammatory activity of lactotransferrin (failed because of LTF c.85G>A and c.140A>G and, probably, of CCDC175 c.-79C>G and c.76 C>T), and due to the reduced expression of PDE4DIP (caused by the two 3' UTR variants c. � 39G>A and c. � 105 G>A), which could prevent the proper transport of phototransduction proteins between the inner and the outer photoreceptor segments. More details in the text. https://doi.org/10.1371/journal.pone.0278857.g009

PLOS ONE
protein-protein interaction assay by NanoBiT 1 and electrophysiological functional assays) of analyzed data to give our findings more strength, as well as the involvement of a wider cohort of patients with genotype-phenotype correlation similar to our examined family.
In this way, our results could provide additional prognostic clues for genetic counseling and a better molecular characterization of patients, allowing them to be included in future clinical trials based on gene therapy. For each subject, genotypes of all analyzed variants are shown on a row of three heatmaps that are built according to an additive (a), dominant (b), or recessive (c) model, respectively. Each column thus represents the genotype of a given variant. In the additive model (a), yellow pixels correspond to wildtype, red ones to heterozygous mutation, and green pixels show mutations in homozygosis. In the dominant (b) and recessive models (c), red pixels highlight variants which were not considered as being mutated, whereas blue ones mean that, under the model considered, mutation occurred. (PDF) S9 Fig. Genotype-phenotype correlations. a) Genotypes distribution of each subject was compared with those of all the other subjects. Such analysis resulted in a score ranging between 0 (blue) and 1 (yellow); the higher this value, the more similar genotypes distribution between two given subjects. b) By applying a cutoff to such maps at a given percentage, only pixels corresponding to subject pairs whose genotype distributions were more similar persist; here a representative similarity cutoff of 70% was chosen. c) The same correspondence could be obtained by considering symptom patterns. After applying a strong cutoff (80% in this case), subjects showing a close correspondence in symptoms distribution are highlighted (d). (PDF)

S10 Fig. Frequency distribution of candidate variants through the population of Messina.
As evidenced by bar plots (a-i), at least two variants, CACNG8 rs66507429:A>T (a) and CCDC175c.76 C>T (e), show a very low frequency distribution in healthy population, suggesting a possible association with pathology. (PDF) S1  Table. ClueGO preliminary pathway analysis involving the 6 candidate genes. The Cytoscape ClueGO plug-in permitted us to cluster selected candidate genes into common pathways, and detailed results are shown in this table. GO-ID: ID associated to Gene Ontology. GO-Term: specific GO annotation. Ontology Source: databases from which was derived the annotation. Term p-value: statistical significance of predicted annotation. Term p-value Corrected with Bonferroni step down: statistical correction applied to previous statistical association analysis. Group p-value: statistical significance of predicted clustered annotations. Group p-value Corrected with Bonferroni step down: statistical correction applied to previous statistical clustering analysis. GO-Levels: Due to the complex structure of GO tree (directed acyclic graph), the GO terms were placed in several levels. In case of using sources without hierarchical structure (KEGG, REACTOME, WikiPathways), the level it is assigned as -1. GO-Groups: the group or the groups that include the term. % Associated Genes: percentage of the genes from the uploaded cluster that were associated with the term, compared with all the genes associated with the term. Nr. Genes: number of the genes from the uploaded cluster that were associated with the term. Gene Symbols: symbols of genes clustered in the same group. (XLSX) S3 Table. RNA-Seq experiment on RPE cells established PDE4DIP, PAX2, LTF, RXRG, CACNG8 and CCDC175 expression in the retina. A whole transcriptome analysis realized on RPE cells, treated with oxidant compounds versus untreated, (full data not shown) highlighted the expression changes of PDE4DIP, PAX2, LTF, RXRG, CACNG8 and CCDC175 genes in human retina. Fold-change 1h: expression difference between treated and untreated samples after 1h. Fold-change 2h: expression difference between treated and untreated samples after 2h. Fold-change 4h: expression difference between treated and untreated samples after 4h. Fold-change 1h: expression difference between treated and untreated samples after 6h. GO biological process, GO cellular component and GO molecular function: annotations of selected gene by Gene Ontology main categories. Database object name: extended name of analyzed gene. (XLSX)